Smart Paper Notes

Extracting Medication Changes in Clinical Narratives using Pre-trained Language Models

Giridhar Kaushik Ramachandran , Kevin Lybarger , Yaya Liu , Diwakar Mahajan , Jennifer J. Liang , Ching-Huei Tsou , Meliha Yetisgen , Özlem Uzuner

Category: Natural Language Processing

2022-08-17

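An accurate and detailed account of patient medications, including medication changes within the patient timeline, is essential for healthcare providers to deliver appropriate patient care. Healthcare providers or the patients themselves may initiate changes to patient medication. Medication changes take many forms, including changes to the prescribed medication and modifications to the associated dosage. These changes provide information about the overall health of the patient and the rationale that led to the current care. Future care can then build on the resulting state of the patient. This work explores the automatic extraction of medication change information from free-text clinical notes. The Contextual Medication Event Dataset (CMED) is a corpus of clinical notes with annotations that characterize medication changes through multiple change-related attributes, including the type of change (start, stop, increase, etc.), the initiator of the change, temporality, change likelihood, and negation. Using CMED, we identify medication mentions in clinical text and propose three novel, high-performing BERT-based systems that resolve the annotated medication change characteristics. We demonstrate that our proposed architectures improve medication change classification performance over the initial work on CMED. We identify medication mentions with high performance at 0.959 F1, and our proposed systems classify medication changes and their attributes at 0.827 F1.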


AdverSAR: Adversarial Search and Rescue via Multi-Agent Reinforcement Learning

Aowabin Rahman , Arnab Bhattacharya , Thiagarajan Ramachandran , Sayak Mukherjee , Himanshu Sharma , Ted Fujimoto , Samrat Chatterjee

Category: Robotics | Machine Learning

2022-12-20

Search and Rescue (SAR) missions in remote environments often employ autonomous multi-robot systems that learn, plan, and execute a combination of local single-robot control actions, group primitives, and global mission-oriented coordination and collaboration. Often, SAR coordination strategies are manually designed by human experts who can remotely control the multi-robot system and enable semi-autonomous operations. However, in remote environments where connectivity is limited and human intervention is often not possible, decentralized collaboration strategies are needed for fully-autonomous operations. Nevertheless, decentralized coordination may be ineffective in adversarial environments due to sensor noise, actuation faults, or manipulation of inter-agent communication data. In this paper, we propose an algorithmic approach based on adversarial multi-agent reinforcement learning (MARL) that allows robots to efficiently coordinate their strategies in the presence of adversarial inter-agent communications. In our setup, the objective of the multi-robot team is to discover targets strategically in an obstacle-strewn geographical area by minimizing the average time needed to find the targets. It is assumed that the robots have no prior knowledge of the target locations, and they can interact with only a subset of neighboring robots at any time. Based on the centralized training with decentralized execution (CTDE) paradigm in MARL, we utilize a hierarchical meta-learning framework to learn dynamic team-coordination modalities and discover emergent team behavior under complex cooperative-competitive scenarios. The effectiveness of our approach is demonstrated on a collection of prototype grid-world environments with different specifications of benign and adversarial agents, target locations, and agent rewards.


Acela: Predictable Datacenter-level Maintenance Job Scheduling

Yi Ding , Aijia Gao , Thibaud Ryden , Kaushik Mitra , Sukumar Kalmanje , Yanai Golany , Michael Carbin , Henry Hoffmann

Category: Machine Learning

2022-12-10

Datacenter operators ensure fair and regular server maintenance by using automated processes to schedule maintenance jobs to complete within a strict time budget. Automating this scheduling problem is challenging because maintenance job duration varies based on both job type and hardware. While it is tempting to use prior machine learning techniques for predicting job duration, we find that the structure of the maintenance job scheduling problem creates a unique challenge. In particular, we show that prior machine learning methods that produce the lowest error predictions do not produce the best scheduling outcomes due to asymmetric costs. Specifically, underpredicting maintenance job duration results in more servers being taken offline and longer server downtime than overpredicting maintenance job duration. The system cost of underprediction is much larger than that of overprediction. We present Acela, a machine learning system for predicting maintenance job duration, which uses quantile regression to bias duration predictions toward overprediction. We integrate Acela into a maintenance job scheduler and evaluate it on datasets from large-scale, production datacenters. Compared to machine learning based predictors from prior work, Acela reduces the number of servers that are taken offline by 1.87-4.28X, and reduces the server offline time by 1.40-2.80X.
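
As a rough illustration of why quantile regression biases predictions toward overprediction, the sketch below implements the standard pinball (quantile) loss in plain Python and finds the constant prediction that minimizes it. The function names, the sample durations, and the quantile level 0.9 are assumptions chosen for illustration; they are not details taken from Acela.

```python
def pinball_loss(y_true, y_pred, q):
    """Pinball (quantile) loss for a single prediction at quantile level q."""
    diff = y_true - y_pred
    # Underprediction (diff > 0) is weighted by q, overprediction by (1 - q).
    return q * diff if diff >= 0 else (q - 1) * diff


def best_constant(durations, q, candidates):
    """Return the candidate constant prediction with the lowest mean pinball loss."""
    def mean_loss(c):
        return sum(pinball_loss(y, c, q) for y in durations) / len(durations)
    return min(candidates, key=mean_loss)


# Hypothetical maintenance job durations in minutes (illustrative only).
durations = [10, 12, 15, 20, 45]
candidates = list(range(5, 61))

# q = 0.5 penalizes both error directions equally and recovers the median.
median_like = best_constant(durations, 0.5, candidates)
# q = 0.9 makes underprediction nine times as costly, pushing the estimate up.
high_quantile = best_constant(durations, 0.9, candidates)

print(median_like, high_quantile)
```

With a quantile level above 0.5, each minute of underprediction costs more than a minute of overprediction, so the minimizer moves toward the longer durations in the sample.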


On Solution Functions of Optimization: Universal Approximation and Covering Number Bounds

Ming Jin , Vanshaj Khattar , Harshal Kaushik , Bilgehan Sel , Ruoxi Jia

Category: Machine Learning

2022-12-02

We study the expressibility and learnability of convex optimization solution functions and their multi-layer architectural extension. The main results are: (1) the class of solution functions of linear programming (LP) and quadratic programming (QP) is a universal approximant for the $C^k$ smooth model class or some restricted Sobolev space, and we characterize the rate-distortion, (2) the approximation power is investigated through a viewpoint of regression error, where information about the target function is provided in terms of data observations, (3) compositionality in the form of a deep architecture with optimization as a layer is shown to reconstruct some basic functions used in numerical analysis without error, which implies that (4) a substantial reduction in rate-distortion can be achieved with a universal network architecture, and (5) we discuss the statistical bounds of empirical covering numbers for LP/QP, as well as a generic optimization problem (possibly nonconvex) by exploiting tame geometry. Our results provide the first rigorous analysis of the approximation and learning-theoretic properties of solution functions, with implications for algorithmic design and performance guarantees.
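
To make the notion of a solution function concrete, here is one standard textbook illustration, which is not necessarily a construction used in the paper: the rectifier $\max(x, 0)$ is exactly the solution function of a one-dimensional QP whose parameter is the input $x$.

```latex
% ReLU as the solution function of a one-dimensional QP with parameter x
\[
  y^\star(x) \;=\; \arg\min_{y \ge 0} \; \tfrac{1}{2}\,(y - x)^2 \;=\; \max(x, 0).
\]
% Reasoning: if x >= 0, the unconstrained minimizer y = x is feasible;
% if x < 0, the objective is increasing on y >= 0, so the minimizer is y = 0.
```

Here the map from the parameter $x$ to the minimizer $y^\star(x)$ is the solution function. It is piecewise linear and reproduces the rectifier without error.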


Scalable Pathogen Detection from Next Generation DNA Sequencing with Deep Learning

Sai Narayanan , Sathyanarayanan N. Aakur , Priyadharsini Ramamurthy , Arunkumar Bagavathi , Vishalini Ramnath , Akhilesh Ramachandran

Category: Machine Learning

2022-11-30

Next-generation sequencing technologies have enhanced the scope of Internet-of-Things (IoT) to include genomics for personalized medicine through the increased availability of an abundance of genome data collected from heterogeneous sources at a reduced cost. Given the sheer magnitude of the collected data and the significant challenges posed by the presence of highly similar genomic structure across species, there is a need for robust, scalable analysis platforms to extract actionable knowledge such as the presence of potentially zoonotic pathogens. The emergence of zoonotic diseases from novel pathogens, such as the influenza virus in 1918 and SARS-CoV-2 in 2019, which can jump species barriers and lead to pandemics, underscores the need for scalable metagenome analysis. In this work, we propose MG2Vec, a deep learning-based solution that uses the transformer network as its backbone, to learn robust features from raw metagenome sequences for downstream biomedical tasks such as targeted and generalized pathogen detection. Extensive experiments on four increasingly challenging, yet realistic diagnostic settings show that the proposed approach can help detect pathogens from uncurated, real-world clinical samples with minimal human supervision in the form of labels. Further, we demonstrate that the learned representations can generalize to completely unrelated pathogens across diseases and species for large-scale metagenome analysis. We provide a comprehensive evaluation of a novel representation learning framework for metagenome-based disease diagnostics with deep learning and provide a way forward for extracting and using robust vector representations from low-cost next-generation sequencing to develop generalizable diagnostic tools.


Towards Realistic Underwater Dataset Generation and Color Restoration

Neham Jain , Gopi Matta , Kaushik Mitra

Category: Computer Vision

2022-11-27

Recovery of true color from underwater images is an ill-posed problem. This is because the wide-band attenuation coefficients for the RGB color channels depend on object range, reflectance, etc. which are difficult to model. Also, there is backscattering due to suspended particles in water. Thus, most existing deep-learning based color restoration methods, which are trained on synthetic underwater datasets, do not perform well on real underwater data. This can be attributed to the fact that synthetic data cannot accurately represent real conditions. To address this issue, we use an image to image translation network to bridge the gap between the synthetic and real domains by translating images from synthetic underwater domain to real underwater domain. Using this multimodal domain adaptation technique, we create a dataset that can capture a diverse array of underwater conditions. We then train a simple but effective CNN based network on our domain adapted dataset to perform color restoration. Code and pre-trained models can be accessed at https://github.com/nehamjain10/TRUDGCR


Hardware/Software co-design with ADC-Less In-memory Computing Hardware for Spiking Neural Networks

Marco Paul E. Apolinario , Adarsh Kumar Kosta , Utkarsh Saxena , Kaushik Roy

Category: Neural and Evolutionary Computing

2022-11-03

Spiking Neural Networks (SNNs) are bio-plausible models that hold great potential for realizing energy-efficient implementations of sequential tasks on resource-constrained edge devices. However, commercial edge platforms based on standard GPUs are not optimized to deploy SNNs, resulting in high energy consumption and latency. While analog In-Memory Computing (IMC) platforms can serve as energy-efficient inference engines, they are burdened by the immense energy, latency, and area requirements of high-precision ADCs (HP-ADCs), which overshadow the benefits of in-memory computations. We propose a hardware/software co-design methodology to deploy SNNs into an ADC-Less IMC architecture that uses sense-amplifiers as 1-bit ADCs, replacing conventional HP-ADCs and alleviating the above issues. Our proposed framework incurs minimal accuracy degradation by performing hardware-aware training and is able to scale beyond simple image classification tasks to more complex sequential regression tasks. Experiments on complex tasks of optical flow estimation and gesture recognition show that progressively increasing the hardware awareness during SNN training allows the model to adapt and learn the errors due to the non-idealities associated with ADC-Less IMC. Also, the proposed ADC-Less IMC offers significant energy and latency improvements, $2-7\times$ and $8.9-24.6\times$, respectively, depending on the SNN model and the workload, compared to HP-ADC IMC.
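
The difference between a 1-bit sense-amplifier readout and a multi-bit ADC readout can be sketched in a few lines of plain Python. This is a toy model only: the function names, the zero threshold, the idealized uniform ADC, and the spike and weight values are all assumptions made for illustration and do not describe the authors' hardware.

```python
def crossbar_column_sum(spikes, weights):
    """Partial sum of one crossbar column: add the weights of rows that spiked."""
    return sum(w for s, w in zip(spikes, weights) if s)


def sense_amplifier(partial_sum, threshold=0.0):
    """1-bit readout: report only whether the partial sum exceeds the threshold."""
    return 1 if partial_sum > threshold else 0


def multi_bit_adc(partial_sum, bits, full_scale):
    """Idealized uniform ADC: clip to [-full_scale, full_scale], then quantize."""
    levels = 2 ** bits - 1
    clipped = max(-full_scale, min(full_scale, partial_sum))
    return round((clipped + full_scale) / (2 * full_scale) * levels)


# Hypothetical binary input spikes and column weights.
spikes = [1, 0, 1, 1]
weights = [0.5, -0.25, -0.125, 0.25]

partial = crossbar_column_sum(spikes, weights)
one_bit = sense_amplifier(partial)
four_bit = multi_bit_adc(partial, bits=4, full_scale=1.0)

print(partial, one_bit, four_bit)
```

The 1-bit readout discards the magnitude of the partial sum. Information loss of this kind is what hardware-aware training has to compensate for.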


Neighborhood Gradient Clustering: An Efficient Decentralized Learning Method for Non-IID Data Distributions

Sai Aparna Aketi , Sangamesh Kodge , Kaushik Roy

Category: Machine Learning

2022-09-28

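Decentralized learning algorithms enable the training of deep learning models over large distributed datasets generated at different devices and locations, without the need for a central server. In practical scenarios, the distributed datasets can have significantly different data distributions across the agents. The current state-of-the-art decentralized algorithms mostly assume the data distributions to be independent and identically distributed (IID). This paper focuses on improving decentralized learning over non-IID data distributions with minimal compute and memory overheads. We propose Neighborhood Gradient Clustering (NGC), a novel decentralized learning algorithm that modifies the local gradients of each agent using self- and cross-gradient information. In particular, the proposed method replaces the local gradients of the model with the weighted mean of the self-gradients, the model-variant cross-gradients (derivatives of the received neighbors' model parameters with respect to the local dataset), and the data-variant cross-gradients (derivatives of the local model with respect to its neighbors' datasets). Further, we present CompNGC, a compressed version of NGC that reduces the communication overhead by $32 \times$ by compressing the cross-gradients. We demonstrate the empirical convergence and efficiency of the proposed technique over non-IID data distributions sampled on various model architectures and graph topologies. Our experiments show that NGC and CompNGC outperform the existing state-of-the-art (SoTA) decentralized learning algorithm over non-IID data by $1-5\%$ with significantly lower compute and memory requirements. Further, we also show that the proposed NGC method outperforms the baseline by $5-40\%$ with no additional communication.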
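
The gradient mixing described in this abstract can be sketched for a toy scalar model in plain Python. This is a deliberately simplified illustration: the function names, the mixing weight `alpha`, the uniform averaging over cross-gradients, and the two-agent data are all assumptions, and the sketch does not reproduce the paper's exact weighting or clustering rule.

```python
def grad(w, data):
    """Gradient of the mean of 0.5 * (w * x - y) ** 2 for a scalar model w."""
    return sum((w * x - y) * x for x, y in data) / len(data)


def ngc_style_gradient(i, models, datasets, neighbors, alpha=0.5):
    """Mix agent i's self-gradient with its cross-gradients.

    alpha and the uniform weighting over cross-gradients are illustrative
    assumptions, not values taken from the paper.
    """
    self_grad = grad(models[i], datasets[i])
    # Model-variant cross-gradients: neighbors' models on the local dataset.
    model_variant = [grad(models[j], datasets[i]) for j in neighbors[i]]
    # Data-variant cross-gradients: the local model on neighbors' datasets.
    # In a real deployment the neighbor would compute these and send them back.
    data_variant = [grad(models[i], datasets[j]) for j in neighbors[i]]
    cross = model_variant + data_variant
    cross_mean = sum(cross) / len(cross)
    return alpha * self_grad + (1 - alpha) * cross_mean


# Two hypothetical agents with different (non-IID) local data.
models = [1.0, 2.0]
datasets = [[(1.0, 2.0)], [(2.0, 6.0)]]
neighbors = {0: [1], 1: [0]}

g0 = ngc_style_gradient(0, models, datasets, neighbors)
g1 = ngc_style_gradient(1, models, datasets, neighbors)
print(g0, g1)
```

Each agent's update direction now reflects its neighbors' data and models as well as its own, which is the intuition behind using cross-gradients when local distributions differ.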


SGTM 2.0: Autonomously Untangling Long Cables using Interactive Perception

Kaushik Shivakumar , Vainavi Viswanath , Anrui Gu , Yahav Avigal , Justin Kerr , Jeffrey Ichnowski , Richard Cheng , Thomas Kollar , Ken Goldberg

Category: Robotics | Artificial Intelligence | Machine Learning

2022-09-27

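Cables are commonplace in homes, hospitals, and industrial warehouses and are prone to tangling. This paper extends prior work on autonomously untangling long cables by introducing novel uncertainty quantification metrics and actions that interact with the cable to reduce perception uncertainty. We present Sliding and Grasping for Tangle Manipulation 2.0 (SGTM 2.0), a system that autonomously untangles cables approximately 3 meters in length with a bilateral robot, using estimates of uncertainty at each step to inform actions. By interactively reducing uncertainty, SGTM 2.0 reduces the number of state-resetting moves it must take, significantly speeding up run-time. Experiments suggest that SGTM 2.0 can achieve 83% untangling success on cables with 1 or 2 overhand and figure-8 knots, and 70% termination detection success across these configurations, outperforming SGTM 1.0 by 43% in untangling accuracy and 200% in full rollout speed. Supplementary material, visualizations, and videos can be found at sites.google.com/view/sgtm2.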


DEFT: Diverse Ensembles for Fast Transfer in Reinforcement Learning

Simeon Adebola , Satvik Sharma , Kaushik Shivakumar

Category: Machine Learning | Artificial Intelligence

2022-09-26

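Deep ensembles have been shown to extend the positive effect seen in typical ensemble learning to neural networks and to reinforcement learning (RL). However, there is still much to be done to improve the efficiency of such ensemble models. In this work, we present Diverse Ensembles for Fast Transfer in RL (DEFT), a new ensemble-based method for reinforcement learning in highly multimodal environments with improved transfer to unseen environments. The algorithm is broken down into two main phases: training of ensemble members, and synthesis (or fine-tuning) of the ensemble members into a policy that works in a new environment. The first phase of the algorithm involves training regular policy gradient or actor-critic agents in parallel while adding a term to the loss that encourages these policies to differ from each other. This causes the individual unimodal agents to explore the space of optimal policies and to capture more of the multimodality of the environment than a single actor could. The second phase of DEFT involves synthesizing the component policies into a new policy that works well in a modified environment, in one of two ways. To evaluate the performance of DEFT, we start with a base version of the Proximal Policy Optimization (PPO) algorithm and extend it with the modifications for DEFT. Our results show that the pretraining phase is effective in producing diverse policies in multimodal environments. DEFT often converges to a high reward significantly faster than alternatives such as random initialization without DEFT and fine-tuning of ensemble members. While there is certainly more work to be done to analyze DEFT theoretically and to make it more robust, we believe it provides a strong framework for capturing multimodality in environments while still using RL methods with simple policy representations.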
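
The first-phase idea of adding a loss term that pushes ensemble members apart can be sketched as follows. The abstract does not specify the form of that term, so the mean pairwise KL divergence, the function names, the weight 0.1, and the example action distributions below are all assumptions made purely for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete action distributions; q must be strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def diversity_bonus(policies):
    """Mean KL divergence over all ordered pairs of distinct ensemble members."""
    pairs = [
        (a, b)
        for i, a in enumerate(policies)
        for j, b in enumerate(policies)
        if i != j
    ]
    return sum(kl_divergence(a, b) for a, b in pairs) / len(pairs)


def member_loss(policy_loss, policies, weight=0.1):
    """Policy loss minus a weighted diversity bonus (hypothetical form)."""
    return policy_loss - weight * diversity_bonus(policies)


# Hypothetical action distributions of two ensemble members in a single state.
identical = [[0.5, 0.5], [0.5, 0.5]]
diverse = [[0.9, 0.1], [0.1, 0.9]]

loss_identical = member_loss(1.0, identical)
loss_diverse = member_loss(1.0, diverse)
print(loss_identical, loss_diverse)
```

Under this assumed form, members whose action distributions differ receive a lower loss than identical members, so gradient descent is nudged toward a more diverse ensemble.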
